




09/700606 



Sheet 2/8

Fig. 3 (flowchart). Boxes: FILL WORKING BUFFER; UPDATE CLUSTER MODEL PARAMETERS; IDENTIFY DATA TO BE RETAINED OR COMPRESSED; COMPRESS DATA -> SUFFICIENT STATISTICS; RETURN FINAL CLUSTER MODEL PARAMETERS. Data source: DATABASE 10.



Fig. 2 (block diagram). Blocks: DATABASE 10; DBMS; DM ENGINE; APPLICATION. Labeled flows: DATA; REQUEST; MODEL SUMMARIES; VISUAL REPORTING. Other reference numerals: 12, 14.



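Flowchart (figure label not legible). Boxes: INITIALIZE; GET DATA; DO EXTENDED CLUSTERING; CALCULATE RS, DS, CS; OUTPUT REPORT. Reference numerals: 100, 110, 120, 130.

Sheet 3/8

Fig. 5 (no legible text)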








Figs. 6A-6D (data structures; the sheet marker 4/8 falls within this group). Structures: DISCARD SUMMARY (DS) with SUM, SUMSQ and M (M1, M2, M3, ...); COMPRESS DATA (CS) with SUM, SUMSQ and M; VECTORS; MODEL with SUM, SUMSQ and M. Reference numerals: 152, 154, 156, 158, 160, 162, 164, 166, 170, 174, 180.
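The SUM, SUMSQ and M fields named in these data structures can be read as sufficient statistics for a group of records. The sketch below is illustrative only: it assumes SUMSQ is a full matrix of summed outer products (a per-attribute vector of squares would be the diagonal-only variant), and the function names `compress` and `mean_and_covariance` are hypothetical, not taken from the drawings.

```python
def compress(points):
    """Reduce a list of equal-length numeric vectors to SUM, SUMSQ and M."""
    n = len(points[0])
    total = [0.0] * n                        # SUM: per-attribute sum
    sumsq = [[0.0] * n for _ in range(n)]    # SUMSQ: sum of outer products (assumed)
    for p in points:
        for a in range(n):
            total[a] += p[a]
            for b in range(n):
                sumsq[a][b] += p[a] * p[b]
    return {"SUM": total, "SUMSQ": sumsq, "M": float(len(points))}


def mean_and_covariance(stats):
    """Recover the mean vector and (population) covariance matrix."""
    m = stats["M"]
    n = len(stats["SUM"])
    mean = [s / m for s in stats["SUM"]]
    cov = [[stats["SUMSQ"][a][b] / m - mean[a] * mean[b] for b in range(n)]
           for a in range(n)]
    return mean, cov


stats = compress([[1.0, 2.0], [3.0, 4.0]])
mean, cov = mean_and_covariance(stats)
```

Once a group of records is summarized this way, the records themselves are no longer needed to recompute a mean or covariance, which is the point of compressing data into these three fields.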







Sheet 5/8

Fig. 7A (flowchart; reference numerals on the sheet: 120, 202, 204, 206, 208, 210, 212, 214, 216, 218, 230, 232, 234, 238)

OLDMODEL = MODEL

DETERMINE k, c, r AND SET MTOTAL = SUM of M(j) FOR EACH CLUSTER j = 1, ..., K

COMPUTE OLDMODEL MEAN + CV MATRIX FOR EACH CLUSTER

RESET (ZERO) NEWMODEL

GET PT FROM RS

FOR EACH CLUSTER(j)
a) FIND PROBABILITY OF DATA RECORD (POINT) IN CLUSTER(j) = P(X|j)
b) WEIGHT(j) = [M(j)/MTOTAL] * P(X|j)
c) WEIGHTSUM = WEIGHTSUM + WEIGHT(j)

NORMALIZE FOR EACH CLUSTER j

OUTERPROD = OUTERPROD(PT, PT)

FOR EACH CLUSTER UPDATE SUM, SUMSQ, M, ATTRIBUTE/VALUE PROBABILITY TABLE

FIND CENTER OF SUBCLUSTER

FOR EACH CLUSTER(j)
a) FIND PROBABILITY OF SUBCLUSTER (CS) IN CLUSTER(j) = P(SUBCLUSTER|j)
b) WEIGHT(j) = [M(j)/MTOTAL] * P(SUBCLUSTER|j)
c) WEIGHTSUM = WEIGHTSUM + WEIGHT(j)

NORMALIZE WEIGHT FOR EACH CLUSTER j
WEIGHT(j) = WEIGHT(j) / WEIGHTSUM

FOR EACH CLUSTER UPDATE SUM, SUMSQ, M, ATTRIBUTE/VALUE PROBABILITY TABLE
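The per-record steps of this flowchart (WEIGHT(j) = [M(j)/MTOTAL] * P(X|j), normalization by WEIGHTSUM, then the update of SUM, SUMSQ and M) can be sketched as follows. This is a minimal illustration under stated assumptions: continuous attributes only, P(X|j) taken as an axis-aligned Gaussian rather than one with the full covariance matrix the flowchart mentions, and hypothetical names (`diag_gaussian_pdf`, `membership_weights`, `update_new_model`, `zero_model`) that do not appear in the drawings. The attribute/value probability table update is omitted.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of x under an axis-aligned Gaussian (assumed for brevity)."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return p


def membership_weights(x, old_model):
    """WEIGHT(j) = [M(j)/MTOTAL] * P(x|j), normalized by WEIGHTSUM."""
    m_total = sum(c["M"] for c in old_model)
    weights = [c["M"] / m_total * diag_gaussian_pdf(x, c["mean"], c["var"])
               for c in old_model]
    weight_sum = sum(weights)   # zero only if x underflows in every cluster
    return [w / weight_sum for w in weights]


def update_new_model(x, weights, new_model):
    """Add the weighted record to each cluster's SUM, SUMSQ and M."""
    n = len(x)
    for w, c in zip(weights, new_model):
        c["M"] += w
        for a in range(n):
            c["SUM"][a] += w * x[a]
            for b in range(n):
                c["SUMSQ"][a][b] += w * x[a] * x[b]   # weighted outer product


def zero_model(k, n):
    """RESET (ZERO) NEWMODEL for k clusters over n attributes."""
    return [{"M": 0.0, "SUM": [0.0] * n,
             "SUMSQ": [[0.0] * n for _ in range(n)]} for _ in range(k)]


old_model = [{"M": 10.0, "mean": [0.0, 0.0], "var": [1.0, 1.0]},
             {"M": 30.0, "mean": [4.0, 4.0], "var": [1.0, 1.0]}]
new_model = zero_model(2, 2)
for pt in [[0.1, -0.2], [3.9, 4.2]]:
    update_new_model(pt, membership_weights(pt, old_model), new_model)
```

Because the weights for each record sum to one, the M values of the new model add up to the number of records processed.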







Flowchart, continued (figure label not legible; reference numerals: 252, 254, 260)

GET ELEMENT FROM DS

FIND CENTER

FOR EACH CLUSTER(j)
a) FIND PROBABILITY OF DS ELEMENT IN CLUSTER(j) = P(DS_ELEM|j)
b) WEIGHT(j) = [M(j)/MTOTAL] * P(DS_ELEM|j)
c) WEIGHTSUM = WEIGHTSUM + WEIGHT(j)

NORMALIZE WEIGHT FOR EACH CLUSTER(j)
WEIGHT(j) = WEIGHT(j) / WEIGHTSUM

FOR EACH CLUSTER UPDATE SUM, SUMSQ, M, ATTRIBUTE/VALUE PROBABILITY TABLE
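For a DS or CS element, the flowchart finds the element's center and weights it against each cluster in the same way as a single record. One plausible reading, sketched below, is that the center is SUM/M and that the element's whole summary is then added to each cluster in proportion to its normalized weight. Both the update rule and the names (`center`, `fold_in_summary`, the `pdf` callables) are assumptions of this sketch, not text from the drawings.

```python
import math


def center(sub):
    """Center of a summarized element, taken here as SUM / M."""
    return [s / sub["M"] for s in sub["SUM"]]


def fold_in_summary(sub, old_model, new_model):
    """Weight the element's center per cluster, then add its statistics."""
    c = center(sub)
    m_total = sum(cl["M"] for cl in old_model)
    # WEIGHT(j) = [M(j)/MTOTAL] * P(element|j), evaluated at the center
    weights = [cl["M"] / m_total * cl["pdf"](c) for cl in old_model]
    weight_sum = sum(weights)
    weights = [w / weight_sum for w in weights]
    n = len(c)
    for w, target in zip(weights, new_model):
        target["M"] += w * sub["M"]
        for a in range(n):
            target["SUM"][a] += w * sub["SUM"][a]
            for b in range(n):
                target["SUMSQ"][a][b] += w * sub["SUMSQ"][a][b]
    return weights


# Two one-dimensional clusters with unnormalized Gaussian-shaped densities.
old_model = [{"M": 5.0, "pdf": lambda x: math.exp(-0.5 * (x[0] - 0.0) ** 2)},
             {"M": 5.0, "pdf": lambda x: math.exp(-0.5 * (x[0] - 10.0) ** 2)}]
new_model = [{"M": 0.0, "SUM": [0.0], "SUMSQ": [[0.0]]} for _ in range(2)]
element = {"M": 4.0, "SUM": [40.0], "SUMSQ": [[404.0]]}   # four records near 10
weights = fold_in_summary(element, old_model, new_model)
```

Handling a summarized element this way costs one probability evaluation per cluster regardless of how many records the element stands for.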







Sheet 7/8

Figs. 8A-8D (data structures with probability tables). Structures: DISCARD SUMMARY (DS) with SUMSQ, M (M1, M2, M3, ..., Mk) and DS PROBABILITY TABLES (PROB #1, PROB #2, PROB #3, ..., PROB #K); COMPRESS DATA (CS) with SUMSQ, M and CS PROBABILITY TABLES (PROB #1, PROB #2, PROB #3, ..., PROB #C); VECTORS; MODEL with SUMSQ, M and MODEL PROBABILITY TABLES (PROB #1, PROB #2, PROB #3, ..., PROB #K). Reference numerals: 152, 154, 160, 162, 164, 166, 170, 172, 176, 180.




Sheet 8/8

MODEL PROBABILITY CLUSTER #K

ATTRIBUTE ID    DISCRETE VALUE #1   DISCRETE VALUE #2   DISCRETE VALUE #3   ...   DISCRETE VALUE #i
1               PROB 1/1            PROB 1/2            PROB 1/3            ...
2
3
...
ATTRIBUTE j     PROB j/1            PROB j/2                                ...   PROB j/i

Fig. 9A



SAMPLE PROBABILITY TABLE CLUSTER #1

ATTRIBUTE   VALUE AND PROBABILITY
COLOR       RED 0.0875     BLUE 0.3001     GREEN 0.4374    WHITE 0.1750
STYLE       SEDAN 0.5627   SPORT 0.3500    TRUCK 0.0873
SEX         MALE 0.2624    FEMALE 0.7376

Fig. 9B
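A table like the sample for cluster #1 can be used to score a record of discrete attributes and can be rebuilt from weighted counts. The sketch below uses the values from the sample table. Treating the attributes as independent within a cluster, so that the record probability is a product of table entries, is an assumption of this illustration, and the function names are hypothetical.

```python
# Values copied from the sample probability table for cluster #1.
CLUSTER_1_TABLE = {
    "COLOR": {"RED": 0.0875, "BLUE": 0.3001, "GREEN": 0.4374, "WHITE": 0.1750},
    "STYLE": {"SEDAN": 0.5627, "SPORT": 0.3500, "TRUCK": 0.0873},
    "SEX": {"MALE": 0.2624, "FEMALE": 0.7376},
}


def record_probability(record, table):
    """P(record | cluster) as a product of per-attribute table entries.

    Assumes attributes are independent within a cluster.
    """
    p = 1.0
    for attribute, value in record.items():
        p *= table[attribute][value]
    return p


def add_weighted_record(record, weight, counts):
    """Accumulate the record's cluster weight under each of its values."""
    for attribute, value in record.items():
        row = counts.setdefault(attribute, {})
        row[value] = row.get(value, 0.0) + weight


def normalize_counts(counts):
    """Turn accumulated weights into a probability table (rows sum to 1)."""
    table = {}
    for attribute, row in counts.items():
        total = sum(row.values())
        table[attribute] = {value: c / total for value, c in row.items()}
    return table


p = record_probability(
    {"COLOR": "BLUE", "STYLE": "SEDAN", "SEX": "FEMALE"}, CLUSTER_1_TABLE)
```

Each row of the sample table sums to one, which is what `normalize_counts` guarantees for a table rebuilt from weighted counts.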



